NeurIPS Rebuttal for " Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks "
We thank the reviewers for their thoughtful, detailed reviews. On the comment that ours is an "information retrieval strategy to improve the generation": pre-trained seq2seq models have only become available in the last year (T5, BART) or two (GPT2). We study two RAG models. RAG-Sequence's formulation is similar to REALM, but RAG-Token is novel. Further, we explore novel decoding strategies for these models. On the concern that the "contribution [...] is not very specific": R1 suggested that "A figure or example about RAG-Sequence Model and RAG-Token Model is needed", and R3 mentions that the "description of the model is quite concise (due to space restrictions)".
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks. However, their ability to access and precisely manipulate knowledge is still limited, and hence on knowledge-intensive tasks, their performance lags behind task-specific architectures. Additionally, providing provenance for their decisions and updating their world knowledge remain open research problems. Pre-trained models with a differentiable access mechanism to explicit non-parametric memory can overcome this issue, but have so far been only investigated for extractive downstream tasks. We explore a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG) -- models which combine pre-trained parametric and non-parametric memory for language generation. We introduce RAG models where the parametric memory is a pre-trained seq2seq model and the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever. We compare two RAG formulations, one which conditions on the same retrieved passages across the whole generated sequence, the other can use different passages per token. We fine-tune and evaluate our models on a wide range of knowledge-intensive NLP tasks and set the state-of-the-art on three open domain QA tasks, outperforming parametric seq2seq models and task-specific retrieve-and-extract architectures. For language generation tasks, we find that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
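The abstract contrasts two marginalizations over retrieved passages: RAG-Sequence conditions the whole output on a single retrieved document, while RAG-Token can draw on a different document per token. A minimal numeric sketch of the two (with hypothetical toy probabilities, not values from the paper):

```python
import numpy as np

# Toy setup (hypothetical numbers): K retrieved documents with retriever
# probabilities p(z|x), and a generator assigning each document-conditioned
# token probability p(y_t | x, z, y_<t) for a T-token target sequence.
rng = np.random.default_rng(0)
K, T = 4, 3                               # documents, target length
p_z = rng.dirichlet(np.ones(K))           # retriever distribution p(z|x)
p_y = rng.uniform(0.1, 0.9, size=(K, T))  # p(y_t | x, z, y_<t) per document

# RAG-Sequence: one document conditions the whole generated sequence.
#   p(y|x) = sum_z p(z|x) * prod_t p(y_t | x, z, y_<t)
p_seq = np.sum(p_z * np.prod(p_y, axis=1))

# RAG-Token: a (potentially) different document per generated token.
#   p(y|x) = prod_t sum_z p(z|x) * p(y_t | x, z, y_<t)
p_tok = np.prod(np.sum(p_z[:, None] * p_y, axis=0))

print(f"RAG-Sequence p(y|x) = {p_seq:.4f}")
print(f"RAG-Token    p(y|x) = {p_tok:.4f}")
```

The sum and product are exchanged between the two formulations; that swap is the entire difference, and it is what lets RAG-Token mix content from several documents within one answer.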
Appendices for Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
A Implementation Details
For Open-domain QA, we report test numbers using 15 retrieved documents for RAG-Token models. For RAG-Sequence models we use the Thorough Decoding approach, since answers are generally short. For generation tasks we use the Fast Decoding approach for RAG-Sequence models, as Thorough Decoding did not improve performance. Figure 4 shows the user interface for human evaluation. Annotators were encouraged to research the topic using the internet, and were given detailed instructions and worked examples in a full instructions tab.
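The distinction between the two RAG-Sequence decoding strategies can be sketched in a few lines. In this hypothetical illustration (the function, names, and numbers are mine, not the paper's), `score(y, z)` stands in for the generator likelihood p(y|x,z) and `beams[z]` is the set of hypotheses produced by beam search on document z. Thorough Decoding runs an extra forward pass for any candidate that did not appear in a document's beam, while Fast Decoding approximates that term as zero:

```python
def rag_sequence_score(candidates, p_z, beams, score, thorough=True):
    """Marginal p(y|x) = sum_z p(z|x) * p(y|x,z) for each candidate y.

    thorough=True: score every (y, z) pair, running an "extra forward
    pass" (here, just a call to `score`) when y missed z's beam.
    thorough=False (Fast Decoding): approximate p(y|x,z) ~= 0 for pairs
    where y did not appear in z's beam, skipping those terms.
    """
    marginals = {}
    for y in candidates:
        total = 0.0
        for z, prior in p_z.items():
            if y in beams[z] or thorough:
                total += prior * score(y, z)
            # else: Fast Decoding drops this term entirely
        marginals[y] = total
    return marginals

# Toy example (all names and probabilities hypothetical):
p_z = {"doc_a": 0.6, "doc_b": 0.4}                  # retriever priors
beams = {"doc_a": {"paris"}, "doc_b": {"paris", "london"}}
score = lambda y, z: 0.9 if y in beams[z] else 0.1  # stand-in generator

thorough = rag_sequence_score(["paris", "london"], p_z, beams, score, thorough=True)
fast = rag_sequence_score(["paris", "london"], p_z, beams, score, thorough=False)
```

In this toy run, the two strategies agree on candidates that appear in every beam ("paris") and differ only on those that miss some beam ("london"), which is why Fast Decoding is a reasonable approximation when the beams largely overlap.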
Review for NeurIPS paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Summary and Contributions: This paper proposes a retrieval-augmented seq2seq model for question answering and related knowledge-intensive NLP tasks. The model is a combination of a pre-trained BART and a dense passage retriever, joined via a joint probabilistic model. Two specific formulations, referred to as RAG-Sequence and RAG-Token, are proposed to let the model select relevant document(s) for generating answers. Experiments are conducted on a range of tasks including open-domain question answering and fact verification, showing that the RAG models achieve state-of-the-art or competitive performance. The design of the model shares some similarity with REALM, which is also retrieval-augmented but uses an encoder-only model.
Review for NeurIPS paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
This work proposes a system that uses the retrieval results for a query to aid the generation of answers. The idea is natural and has been explored by a number of authors in various ways. The paper is clearly written, and I enjoyed reading it. I see this as a nice piece of work that combines several existing models in a neat, though not strikingly novel or inspirational, way. The major downside of the work is its limited novelty, but its strong empirical results and potential impact on practice are enough to support its acceptance at NeurIPS.
Facebook AI Releases KILT, A New Benchmark For Knowledge-Intensive NLP Tasks
AI researchers have made significant advancements in building models that generate text mimicking natural language. State-of-the-art systems perform so well that it is sometimes hard to distinguish their output from text written by a person. An essential next step is to make these models generate fluent text grounded in real-world knowledge. KILT (Knowledge Intensive Language Tasks) helps AI researchers and enthusiasts build models that can better leverage real-world information to accomplish a broad range of tasks. KILT is thus the first benchmark to aggregate data sets representing such a wide variety of knowledge-intensive tasks.